DNA Methylation Linuron Multigenerational Xenopus tropicalis

F2 Generation

Abstract

Methods

The experiments were conducted at the Evolutionary Biology Center, Uppsala University, with Xenopus tropicalis frogs originated from Xenopus One (Dexter, MI, USA). This study is a continuation of (Orton et al., 2018) and (Karlsson et al., 2021). Briefly, in the F0 generation, free-swimming tadpoles (NF stage 40, (Faber and Nieuwkoop, 1994)) were exposed to 45 μg/L of linuron (99.9% purity; CAS 330–55-2, Sigma-Aldrich, St. Louis, MO, USA) dissolved in 0.0008% acetone or to acetone only (Control group) until they reached metamorphosis (NF 66). The exposure was conducted with three tanks per treatment, using a semi-static system where half of the water was changed three times per week, photoperiod 12:12h and 26°C. Linuron concentrations were analysed using gas chromatography/mass spectroscopy (GCMS) throughout the experiment and water quality was verified every other week (Orton et al., 2018). After the exposure period, the animals were transferred to clean water and when they reached sexual maturity, exposed males were mated with naïve females to obtain the F1 generation and follow the male-transmitted effects of linuron. At 20 months of age, F1 males were mated with naïve females to obtain F2 generation, allowing the study of transgenerational effects (Karlsson et al., 2021). The experiments were approved by the Uppsala Animal Ethical Committee and in accordance with the Swedish Legislation on Animal Experimentation (Animal Welfare Act SFS1998:56) and the European Union Directive on the Protection of Animals Used for Scientific Purposes (2010/63/EU).

Dissection

The frogs were anaesthetized in buffered tricaine (tricainemethane sulfonate 0.3%, pH 7; Sigma-Aldrich, St. Louis, MO, USA) and euthanised by decapitation. Brain and the right pancreas were collected and stored at -80 °C until analysis.

DNA extraction, library preparation and sequencing

Genomic DNA from brain and pancreas samples from 7 control and 9 treatment individuals in the F2 generation were extracted using a Qiagen Allprep DNA/RNA mini kit (Qiagen, Germany) according to the manufacturer’s protocol with an additional wash with AW2 buffer. DNA quality was evaluated using NanoDrop ND-1000 spectrometer (Thermo Fisher Scientific, United States) and agarose gel electrophoresis (Bio-Rad Laboratories, US). DNA concentration was measured using Qubit 4 fluorometer with double-stranded DNA (dsDNA) Broad Range (BR) Assay (Thermo Fisher Scientific, United States). One pancreas control sample with low concentration and low 230/260nm absorbance ratio value was excluded from sequencing. Library preparation and RRBS was performed by the Bioinformatics and Expression Analysis core facility (BEA, Karolinska Institute, Sweden) using Diagenode Premium RRBS kit (Diagenode, Belgium) and sequenced on the Illumina Nextseq 2000 platform (Illumina, United States). Average sequencing read size was 44.18 ± 16.10 (mean ± s.d.) millions of single-end reads. The data for this study have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under accession number PRJEB59360 (https://www.ebi.ac.uk/ena/browser/view/PRJEB59360).

Bioinformatics analysis

Sequencing data were processed using the Nextflow nf-core methylseq pipeline version 1.6.1 (Ewels et al., 2021, 2020). The reads were aligned to the X. tropicalis reference genome (UCB v. 10.0) using Bismark v. 0.23.0, with an average of 41.7% ± 4.5% reads uniquely aligned. Sequencing quality was verified by visualizing FastQC v0.11.9 reports.

Hypothesis

Increased methylation levels on CpG islands on promoter regions are generally associated with repression of RNA expression. We hypothesise that differential methylatation will be found on regions related to genes associated with metabolic pathways and gonadal germ cell development.

Pancreas

# Define the list containing the bismark coverage files.
file.names <- list.files("C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/", full.names = TRUE)
file.names2 <- as.list(file.names)

# Sorting files order
pancreas <- file.names2[grep("_P_", file.names2)]  
pancreas <- as.list(unlist(pancreas)[order(substr(unlist(pancreas), start = 118, stop = max(nchar(unlist(pancreas)))))])
pancreas
## [[1]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_35_P_C2_S14_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
## 
## [[2]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_24_P_C3_S3_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
## 
## [[3]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_40_P_C4_S19_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
## 
## [[4]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_17_P_C5_S17_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
## 
## [[5]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_29_P_C6_S8_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
## 
## [[6]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_28_P_C7_S7_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
## 
## [[7]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_31_P_H3_S10_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
## 
## [[8]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_26_P_H7_S5_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
## 
## [[9]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_3_P_H4_S3_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
## 
## [[10]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_5_P_H5_S5_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
## 
## [[11]]
## [1] "C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/methylation_coverage/BEA21P085_MR_8_P_H9_S8_R1_001_trimmed_bismark_bt2.bismark.cov.gz"
# read the listed files into a methylRawList object making sure the other
# parameters are filled in correctly.
myobj.pancreas <- methRead(pancreas,
                           sample.id=list("pancreas_C2", "pancreas_C3", "pancreas_C4", "pancreas_C5", "pancreas_C6", "pancreas_C7", 
                                          "pancreas_H3", "pancreas_H7","pancreas_H4", "pancreas_H5", "pancreas_H9"),
                           pipeline = "bismarkCoverage",
                           assembly="Xtro10.0",
                           treatment=c(0,0,0,0,0,0,1,1,1,1,1),
                           mincov = 10
)

# Change chromosome names
chr.alias <- read.table("C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/Xtro_chromAlias.txt", header = TRUE)

chr.name1 <- as.list(chr.alias$genbank)
chr.name2 <- as.list(chr.alias$ucsc)

for(i in 1:length(myobj.pancreas)){
  for(x in 1:length(chr.name1)){
  myobj.pancreas[[i]]$chr[myobj.pancreas[[i]]$chr == chr.name1[[x]]] <- chr.name2[[x]]
}}

# Check chromosome names
#levels(as.factor(myobj.pancreas[[1]]$chr))

lapply(myobj.pancreas, FUN = nrow) %>% unlist %>% mean
## [1] 518715.4
lapply(myobj.pancreas, FUN = nrow) %>% unlist %>% sd
## [1] 222825.5
myobj.pancreas.tile <- tileMethylCounts(myobj.pancreas, win.size = 200, step.size = 200)

Methylation percentage / Read coverage per sample

# check number of samples
#myobj.pancreas.tile

# What type of data is stored here?
#head(myobj.pancreas.tile[[1]])

# Get a histogram of the methylation percentage per sample
lapply(myobj.pancreas.tile, function(x) {getMethylationStats(x, plot=TRUE, both.strands=FALSE)})

## [[1]]
## NULL
## 
## [[2]]
## NULL
## 
## [[3]]
## NULL
## 
## [[4]]
## NULL
## 
## [[5]]
## NULL
## 
## [[6]]
## NULL
## 
## [[7]]
## NULL
## 
## [[8]]
## NULL
## 
## [[9]]
## NULL
## 
## [[10]]
## NULL
## 
## [[11]]
## NULL
# Get a histogram of the read coverage per sample
lapply(myobj.pancreas.tile, function(x) {getCoverageStats(x, plot=TRUE, both.strands=FALSE)})

## [[1]]
## NULL
## 
## [[2]]
## NULL
## 
## [[3]]
## NULL
## 
## [[4]]
## NULL
## 
## [[5]]
## NULL
## 
## [[6]]
## NULL
## 
## [[7]]
## NULL
## 
## [[8]]
## NULL
## 
## [[9]]
## NULL
## 
## [[10]]
## NULL
## 
## [[11]]
## NULL

Filtering

Number of CpGs covered:

min <- 4L #min.per.group unite

# Filter step = Discards bases with coverage below 10 reads and above 99.9th percentile
myobj.filt.pancreas <- filterByCoverage(myobj.pancreas.tile,
                               lo.count=10,
                               lo.perc=NULL,
                               hi.count=NULL,
                               hi.perc=99.9)
# Normalization
myobj.filt.norm.pancreas <- normalizeCoverage(myobj.filt.pancreas, method = "median")

# Merge data
meth.pancreas <- methylKit::unite(myobj.filt.norm.pancreas, destrand=FALSE, min.per.group = min) #min.per.group = minimum number of samples per replicate needed to cover a region/base

# Number of bases covered
nrow(meth.pancreas)
## [1] 33308

Further Filtering

Distribution of the per-CpG standard deviation to determine a suitable cutoff

##Further Filtering

# get percent methylation matrix
pm.pancreas=percMethylation(meth.pancreas)

# calculate standard deviation of CpGs
sds.pancreas=matrixStats::rowSds(pm.pancreas, na.rm = TRUE)

# Visualize the distribution of the per-CpG standard deviation
# to determine a suitable cutoff
hist(sds.pancreas, breaks = 100)

# keep only CpG with standard deviations larger than 2%
meth.pancreas <- meth.pancreas[sds.pancreas > 2]

Keep only CpG with standard deviations larger than 2%. This leaves us with this number of CpG regions:

# This leaves us with this number of CpG regions
nrow(meth.pancreas)
## [1] 26522

Data Structure/Outlier Detection

##Data Structure/Outlier Detection

getCorrelation(meth.pancreas,plot=TRUE)
##             pancreas_C2 pancreas_C3 pancreas_C4 pancreas_C5 pancreas_C6
## pancreas_C2   1.0000000   0.8366548   0.8174115   0.7203372   0.8096893
## pancreas_C3   0.8366548   1.0000000   0.8072408   0.7215267   0.8121373
## pancreas_C4   0.8174115   0.8072408   1.0000000   0.6934262   0.7859786
## pancreas_C5   0.7203372   0.7215267   0.6934262   1.0000000   0.6934174
## pancreas_C6   0.8096893   0.8121373   0.7859786   0.6934174   1.0000000
## pancreas_C7   0.8174067   0.8212190   0.8085935   0.7085958   0.7856337
## pancreas_H3   0.8348607   0.8270006   0.8144287   0.7174424   0.8020269
## pancreas_H7   0.8391190   0.8332716   0.8181180   0.7226913   0.7954552
## pancreas_H4   0.7393347   0.7408327   0.7285859   0.6490361   0.7290312
## pancreas_H5   0.7485675   0.7355218   0.7284212   0.6504795   0.7224872
## pancreas_H9   0.7289987   0.7271876   0.7266854   0.6408967   0.7056170
##             pancreas_C7 pancreas_H3 pancreas_H7 pancreas_H4 pancreas_H5
## pancreas_C2   0.8174067   0.8348607   0.8391190   0.7393347   0.7485675
## pancreas_C3   0.8212190   0.8270006   0.8332716   0.7408327   0.7355218
## pancreas_C4   0.8085935   0.8144287   0.8181180   0.7285859   0.7284212
## pancreas_C5   0.7085958   0.7174424   0.7226913   0.6490361   0.6504795
## pancreas_C6   0.7856337   0.8020269   0.7954552   0.7290312   0.7224872
## pancreas_C7   1.0000000   0.8225279   0.8234151   0.7334714   0.7373197
## pancreas_H3   0.8225279   1.0000000   0.8384927   0.7560391   0.7460011
## pancreas_H7   0.8234151   0.8384927   1.0000000   0.7464905   0.7436379
## pancreas_H4   0.7334714   0.7560391   0.7464905   1.0000000   0.6814391
## pancreas_H5   0.7373197   0.7460011   0.7436379   0.6814391   1.0000000
## pancreas_H9   0.7205777   0.7433411   0.7344374   0.6767006   0.6771200
##             pancreas_H9
## pancreas_C2   0.7289987
## pancreas_C3   0.7271876
## pancreas_C4   0.7266854
## pancreas_C5   0.6408967
## pancreas_C6   0.7056170
## pancreas_C7   0.7205777
## pancreas_H3   0.7433411
## pancreas_H7   0.7344374
## pancreas_H4   0.6767006
## pancreas_H5   0.6771200
## pancreas_H9   1.0000000

clusterSamples(meth.pancreas, dist="correlation", method="ward.D", plot=TRUE)

## 
## Call:
## hclust(d = d, method = HCLUST.METHODS[hclust.method])
## 
## Cluster method   : ward.D 
## Distance         : pearson 
## Number of objects: 11
PCASamples(meth.pancreas, comp = c(1,2))

pancreas Differentially Methylated Regions

Histogram p-values, q-values(adjusted for FDR) and % methilation difference between control and treatment

# Test for differential methylation... This might take a few minutes.
myDiff.pancreas <- calculateDiffMeth(meth.pancreas,
                            overdispersion = "MN",
                            adjust="SLIM",
                            test = "Chisq")
## two groups detected:
##  will calculate methylation difference as the difference of
## treatment (group: 1) - control (group: 0)
pancreas.meth.dir <- c("../data/myDiff.pancreas.RData")
save(myDiff.pancreas, ascii=FALSE, file=pancreas.meth.dir)
rm(myDiff.pancreas)
load(pancreas.meth.dir)
myDiff.pancreas
hist(myDiff.pancreas$pvalue)

hist(myDiff.pancreas$qvalue)

hist(myDiff.pancreas$meth.diff)

Overview of percentage hyper and hypo DMRs per chromosome.

meth.cut <- 10
qvalue.cut <- 0.05

# Overview of percentage hyper and hypo CpGs per chromosome.
diffMethPerChr(myDiff.pancreas, meth.cutoff = meth.cut, qvalue.cutoff = qvalue.cut)
## Warning in eval(quote(list(...)), env): NAs introduced by coercion

# get hyper methylated bases and order by qvalue
myDiff10p.hyper.pancreas <- getMethylDiff(myDiff.pancreas,
                                 difference=meth.cut,
                                 qvalue=qvalue.cut,
                                 type="hyper")
myDiff10p.hyper.pancreas <- myDiff10p.hyper.pancreas[order(myDiff10p.hyper.pancreas$qvalue),]

# get hypo methylated bases and order by qvalue
myDiff10p.hypo.pancreas <- getMethylDiff(myDiff.pancreas,
                                difference=meth.cut,
                                qvalue=qvalue.cut,
                                type="hypo")
myDiff10p.hypo.pancreas <- myDiff10p.hypo.pancreas[order(myDiff10p.hypo.pancreas$qvalue),]

List of all pancreas Differentially Methylated Regions

# get all differentially methylated bases and order by qvalue
myDiff10p.pancreas <- getMethylDiff(myDiff.pancreas,
                           difference=meth.cut,
                           qvalue=qvalue.cut)
myDiff10p.pancreas <- myDiff10p.pancreas[order(myDiff10p.pancreas$qvalue),]

myDiff10p.pancreas %>% datatable

CpG Annotation

###CpG annotation

# First load the annotation data; i.e the coordinates of promoters, TSS, intron and exons

ensembl_anot.pancreas <- readTranscriptFeatures("../annotations/Xtro10_ensemble_gtf2bed_ucsc_chr.bed")
## Reading the table...
## Calculating intron coordinates...
## Calculating exon coordinates...
## Calculating TSS coordinates...
## Calculating promoter coordinates...
## Outputting the final GRangesList...
# Annotate hypermethylated CpGs ("target") with promoter/exon/intron
# information ("feature"). This function operates on GRanges objects, so we # first coerce the methylKit object to GRanges.
myDiff10p.anot.pancreas <- annotateWithGeneParts(target = as(myDiff10p.pancreas,"GRanges"),
                                              feature = ensembl_anot.pancreas,
                                              strand = TRUE,
                                              intersect.chr = TRUE)
## intersecting chromosomes...
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279433v1, chrUn_NW_022279350v1
##   - in 'y': chrM
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).

## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279433v1, chrUn_NW_022279350v1
##   - in 'y': chrM
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).

## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279433v1, chrUn_NW_022279350v1
##   - in 'y': chrM
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrM
##   - in 'y': chrUn_NW_022279433v1, chrUn_NW_022279350v1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).

## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrM
##   - in 'y': chrUn_NW_022279433v1, chrUn_NW_022279350v1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).

## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrM
##   - in 'y': chrUn_NW_022279433v1, chrUn_NW_022279350v1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).
getTargetAnnotationStats(myDiff10p.anot.pancreas, percentage=FALSE,precedence=TRUE)
##   promoter       exon     intron intergenic 
##         70        123        387        410
# Summary of target set annotation
myDiff10p.anot.pancreas
## Summary of target set annotation with genic parts
## Rows in target set: 990
## -----------------------
## percentage of target features overlapping with annotation:
##   promoter       exon     intron intergenic 
##       7.07      16.97      52.02      41.41
## 
## percentage of target features overlapping with annotation:
## (with promoter > exon > intron precedence):
##   promoter       exon     intron intergenic 
##       7.07      12.42      39.09      41.41
## 
## percentage of annotation boundaries with feature overlap:
## promoter     exon   intron 
##     0.23     0.07     0.26
## 
## summary of distances to the nearest TSS:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0    5518   17996   41444   47399  459474
## 
# View the distance to the nearest Transcription Start Site; the target.row column in the output indicates the row number in the initial target set
dist_tss.pancreas <- getAssociationWithTSS(myDiff10p.anot.pancreas)
hist(dist_tss.pancreas$dist.to.feature)

Count of differentially methylated CpGs within promoters,introns or exons

# See whether the differentially methylated CpGs are within promoters,introns or exons; the order is the same as the target set
feature.sum.pancreas <- genomation::getMembers(myDiff10p.anot.pancreas)
sum.pancreas <- apply(feature.sum.pancreas, 2, FUN = sum) %>% t %>% data.frame %>%
  mutate(intergenic = nrow(myDiff10p.pancreas) - sum(.)) %>%
  mutate(total = sum(.))
sum.pancreas
# This can also be summarized for all differentially methylated CpGs
genomation::plotTargetAnnotation(myDiff10p.anot.pancreas, main = "pancreas Differentially Methylated Regions Genomic Features")

Volcano plot with annotated genes associated with DMR

martfile <- c("../data/mart_xtropicalis_gene_ensembl.RData")

load(martfile)

xen.biomart <- read.table("../annotations/xen.biomart.txt", header = TRUE)

# Table overlapping CpG regions with associated genes

library(ChIPpeakAnno)

metadata <- getData(myDiff10p.pancreas)

meth_pos.pancreas <- GRanges(
  seqnames = myDiff10p.pancreas$chr, 
  ranges = IRanges(start = myDiff10p.pancreas$start, end = myDiff10p.pancreas$end),
  mcols = metadata[, c(5:7)]
)

xtro <- getAnnotation(mart, featureType = "TSS")

res.pancreas <- annotatePeakInBatch(meth_pos.pancreas, AnnotationData = xtro, FeatureLocForDistance="TSS", output = "overlapping", maxgap = 2000, multiple = TRUE) %>% data.frame
## Warning in annotatePeakInBatch(meth_pos.pancreas, AnnotationData = xtro, : not all the seqnames of myPeakList is 
##                     in the AnnotationData.
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279347v1, chrUn_NW_022279348v1, chrUn_NW_022279350v1, chrUn_NW_022279353v1, chrUn_NW_022279357v1, chrUn_NW_022279358v1, chrUn_NW_022279362v1, chrUn_NW_022279363v1, chrUn_NW_022279364v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279373v1, chrUn_NW_022279374v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279378v1, chrUn_NW_022279381v1, chrUn_NW_022279382v1, chrUn_NW_022279386v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279393v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279399v1, chrUn_NW_022279400v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279408v1, chrUn_NW_022279409v1, chrUn_NW_022279410v1, chrUn_NW_022279411v1, chrUn_NW_022279415v1, chrUn_NW_022279421v1, chrUn_NW_022279422v1, chrUn_NW_022279424v1, chrUn_NW_022279425v1, chrUn_NW_022279426v1, chrUn_NW_022279427v1, chrUn_NW_022279431v1, chrUn_NW_022279433v1, chrUn_NW_022279437v1, chrUn_NW_022279438v1, chrUn_NW_022279439v1, chrUn_NW_022279440v1, chrUn_NW_022279441v1, chrUn_NW_022279442v1, chrUn_NW_022279444v1, chrUn_NW_022279446v1, chrUn_NW_022279453v1, chrUn_NW_022279457v1, chrUn_NW_022279460v1, chrUn_NW_022279470v1, chrUn_NW_022279471v1, chrUn_NW_022279472v1, chrUn_NW_022279474v1, chrUn_NW_022279483v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1, chrUn_NW_022279502v1, chrUn_NW_022279359v1, chrUn_NW_022279361v1, chrUn_NW_022279398v1, chrUn_NW_022279418v1, chrUn_NW_022279429v1, chrUn_NW_022279443v1, chrUn_NW_022279452v1, chrUn_NW_022279459v1, chrUn_NW_022279461v1, chrUn_NW_022279467v1, chrUn_NW_022279349v1, chrUn_NW_022279355v1, chrUn_NW_022279394v1, chrUn_NW_022279449v1, chrUn_NW_022279479v1, chrUn_NW_022279488v1, chrUn_NW_022279480v1, chrUn_NW_022279485v1, chrUn_NW_022279498v1, chrUn_NW_022279501v1, chrUn_NW_022279448v1, chrUn_NW_022279499v1, chrUn_NW_022279469v1, chrUn_NW_022279369v1, chrUn_NW_022279401v1, chrUn_NW_022279435v1, chrUn_NW_022279456v1, chrUn_NW_022279413v1
##   - in 'y': AAMC04000057.1, AAMC04000136.1, AAMC04000166.1, chrM, AAMC04000105.1, AAMC04000062.1, AAMC04000161.1, AAMC04000165.1, AAMC04000012.1, AAMC04000106.1, AAMC04000158.1, AAMC04000063.1, AAMC04000134.1, AAMC04000095.1, AAMC04000104.1, AAMC04000014.1, AAMC04000097.1, AAMC04000085.1, AAMC04000108.1, AAMC04000060.1, AAMC04000056.1, AAMC04000051.1, AAMC04000046.1, AAMC04000030.1, AAMC04000033.1, AAMC04000050.1, AAMC04000069.1, AAMC04000022.1, AAMC04000037.1, AAMC04000028.1, AAMC04000021.1, AAMC04000032.1, AAMC04000042.1, AAMC04000029.1, AAMC04000011.1, AAMC04000017.1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).
t.pancreas <- res.pancreas %>%
  left_join(
    dplyr::select(xen.biomart, ensembl_gene_id, external_gene_name, description),
    by = c("feature" = "ensembl_gene_id"))

t.pancreas[,c(1:3,6:8,10,14,18:19)] %>% datatable
#plot paramenters
pancreas.plot <- merge(getData(myDiff.pancreas) , t.pancreas, all.x = TRUE) %>%
  mutate(methylation_status = case_when(
    meth.diff > meth.cut & qvalue < qvalue.cut ~ "Hypermethylated",
    meth.diff < -meth.cut & qvalue < qvalue.cut ~ "Hypomethylated",
    TRUE ~ "N.S."
  ))

#ggplot

plot.pancreas <- ggplot(data=pancreas.plot, aes(x=meth.diff, y=-log10(qvalue), col=methylation_status)) +
  geom_point() + 
  theme_minimal() +
  scale_color_manual(values=c("mediumblue", "firebrick", "gray")) +
  geom_vline(xintercept=c(-meth.cut, meth.cut), col="gray57", linetype = "longdash") +
  geom_hline(yintercept=-log10(qvalue.cut), col="gray57", linetype = "longdash") +
  labs(title = "pancreas Differentially Methylated Regions",
    subtitle =paste("qvalue <", as.character(qvalue.cut), "& |methylation difference| >=", 
    as.character(meth.cut), "%")) +
  xlab("% Methylation Change")

plot.pancreas

ggplot(data=pancreas.plot, aes(x=meth.diff, y=-log10(qvalue), col=methylation_status, label = external_gene_name)) +
  geom_point() + 
  theme_minimal() +
  scale_color_manual(values=c("mediumblue", "firebrick", "gray")) +
  geom_vline(xintercept=c(-meth.cut, meth.cut), col="gray57", linetype = "longdash") +
  geom_hline(yintercept=-log10(qvalue.cut), col="gray57", linetype = "longdash") +
  geom_text_repel(show.legend = FALSE) +
  labs(title = "pancreas Differentially Methylated Regions",
    subtitle =paste("qvalue <", as.character(qvalue.cut), "& |methylation difference| >=", 
    as.character(meth.cut), "%")) +
  xlab("% Methylation Change")
## Warning: Removed 25900 rows containing missing values (geom_text_repel).
## Warning: ggrepel: 556 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

CpG regions locations

Sum of CpG regions on islands or shores

# Load the CpG info
cpg_anot <- readFeatureFlank("C:/Users/MauricioRoza/OneDrive/PhD/Linuron multigenerational/Sequencing analysis/CpG_Xtro10.txt", feature.flank.name = c("CpGi", "shores"), flank=2000)

diffCpGann.pancreas <- annotateWithFeatureFlank(as(myDiff10p.pancreas,"GRanges"), feature = cpg_anot$CpGi, flank = cpg_anot$shores, feature.name = "CpGi", flank.name = "shores")
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279433v1
##   - in 'y': chrUn_NW_022279347v1, chrUn_NW_022279355v1, chrUn_NW_022279358v1, chrUn_NW_022279360v1, chrUn_NW_022279361v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279369v1, chrUn_NW_022279371v1, chrUn_NW_022279373v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279381v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279399v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279405v1, chrUn_NW_022279410v1, chrUn_NW_022279412v1, chrUn_NW_022279419v1, chrUn_NW_022279420v1, chrUn_NW_022279424v1, chrUn_NW_022279429v1, chrUn_NW_022279430v1, chrUn_NW_022279431v1, chrUn_NW_022279434v1, chrUn_NW_022279444v1, chrUn_NW_022279468v1, chrUn_NW_022279477v1, chrUn_NW_022279490v1, chrUn_NW_022279491v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).

## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279433v1
##   - in 'y': chrUn_NW_022279347v1, chrUn_NW_022279355v1, chrUn_NW_022279358v1, chrUn_NW_022279360v1, chrUn_NW_022279361v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279369v1, chrUn_NW_022279371v1, chrUn_NW_022279373v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279381v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279399v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279405v1, chrUn_NW_022279410v1, chrUn_NW_022279412v1, chrUn_NW_022279419v1, chrUn_NW_022279420v1, chrUn_NW_022279424v1, chrUn_NW_022279429v1, chrUn_NW_022279430v1, chrUn_NW_022279431v1, chrUn_NW_022279434v1, chrUn_NW_022279444v1, chrUn_NW_022279468v1, chrUn_NW_022279477v1, chrUn_NW_022279490v1, chrUn_NW_022279491v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279347v1, chrUn_NW_022279355v1, chrUn_NW_022279358v1, chrUn_NW_022279360v1, chrUn_NW_022279361v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279369v1, chrUn_NW_022279371v1, chrUn_NW_022279373v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279381v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279399v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279405v1, chrUn_NW_022279410v1, chrUn_NW_022279412v1, chrUn_NW_022279419v1, chrUn_NW_022279420v1, chrUn_NW_022279424v1, chrUn_NW_022279429v1, chrUn_NW_022279430v1, chrUn_NW_022279431v1, chrUn_NW_022279434v1, chrUn_NW_022279444v1, chrUn_NW_022279468v1, chrUn_NW_022279477v1, chrUn_NW_022279490v1, chrUn_NW_022279491v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1
##   - in 'y': chrUn_NW_022279433v1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).

## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279347v1, chrUn_NW_022279355v1, chrUn_NW_022279358v1, chrUn_NW_022279360v1, chrUn_NW_022279361v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279369v1, chrUn_NW_022279371v1, chrUn_NW_022279373v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279381v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279399v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279405v1, chrUn_NW_022279410v1, chrUn_NW_022279412v1, chrUn_NW_022279419v1, chrUn_NW_022279420v1, chrUn_NW_022279424v1, chrUn_NW_022279429v1, chrUn_NW_022279430v1, chrUn_NW_022279431v1, chrUn_NW_022279434v1, chrUn_NW_022279444v1, chrUn_NW_022279468v1, chrUn_NW_022279477v1, chrUn_NW_022279490v1, chrUn_NW_022279491v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1
##   - in 'y': chrUn_NW_022279433v1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).
# See whether the CpG in myDiff10p belong to a CpG Island or Shore
features.cpgi <- genomation::getMembers(diffCpGann.pancreas)
apply(features.cpgi, 2, FUN = sum)
##   CpGi shores 
##     82    136

Percentage of CpG on Islands, shores or open sea

# This can also be summarized for all differentially methylated CpGs
genomation::plotTargetAnnotation(diffCpGann.pancreas, main = "Differential Methylation Annotation")

Differentially Methylated CpG islands

# Summarize the original object counts over a certain region, here the CpG Islands
# You can ignore the warnings here...
myobj_islands.pancreas <- regionCounts(myobj.pancreas, c(cpg_anot$CpGi, cpg_anot$shores))

# Filter the summarized counts by coverage
myobj_islands_filt.pancreas <- filterByCoverage(myobj_islands.pancreas,
                                             lo.count=10,
                                             lo.perc=NULL,
                                             hi.count=NULL,
                                             hi.perc=99.9)
# Perform simple normalization
myobj_islands_filt_norm.pancreas <- normalizeCoverage(myobj_islands_filt.pancreas, method = "median")

Number of CpG Islands

# Merge the samples again
meth_islands.pancreas <- methylKit::unite(myobj_islands_filt_norm.pancreas, destrand=FALSE, min.per.group = min)
nrow(meth_islands.pancreas)

# get percent methylation matrix
pm.pancreas.i=percMethylation(meth_islands.pancreas)

# calculate standard deviation of CpGs
sds.pancreas.i=matrixStats::rowSds(pm.pancreas.i, na.rm = TRUE)

# Visualize the distribution of the per-CpG standard deviation
# to determine a suitable cutoff
hist(sds.pancreas.i, breaks = 100)

# keep only CpG with standard deviations larger than 2%
meth.pancreas.i <- meth_islands.pancreas[sds.pancreas.i > 2]
nrow(meth.pancreas.i)
# Test for differential methylation... This might take a few minutes.
myDiff_islands.pancreas <- calculateDiffMeth(meth.pancreas.i,
                                          overdispersion = "MN",
                                          adjust="SLIM",
                                          test = "Chisq")

# Rank by significance
myDiff_islands.pancreas <- myDiff_islands.pancreas[order(myDiff_islands.pancreas$qvalue),]
# get all differentially methylated CpG Islands
myDiff_islands_10p.pancreas <- getMethylDiff(myDiff_islands.pancreas,difference=meth.cut,qvalue=qvalue.cut)

myDiff_islands_10p_ann.pancreas <- annotateWithGeneParts(as((myDiff_islands_10p.pancreas), "GRanges"), ensembl_anot.pancreas)
# View the distance to the nearest Transcription Start Site; the target.row column indicates the row number in myDiff_islands_10p
getAssociationWithTSS(myDiff_islands_10p_ann.pancreas) %>% datatable

Volcano plot Differentially Methylated CpGislands+shores

########### CpG Islands annotations
# convert to genomic ranges

metadata.i <- getData(myDiff_islands_10p.pancreas)

meth_pos.i <- GRanges(
  seqnames = myDiff_islands_10p.pancreas$chr, 
  ranges = IRanges(start = myDiff_islands_10p.pancreas$start, end = myDiff_islands_10p.pancreas$end),
  mcols = metadata.i[, c(5:7)]
)

res.pancreas.i <- annotatePeakInBatch(meth_pos.i, AnnotationData = xtro, FeatureLocForDistance="TSS", output = "overlapping", multiple = TRUE) %>% data.frame

t.pancreas.i <- res.pancreas.i %>%
  left_join(
    dplyr::select(xen.biomart, ensembl_gene_id, external_gene_name, description),
    by = c("feature" = "ensembl_gene_id"))

t.pancreas.i[,c(6:8,10,14,18:19)] %>% datatable

#plot paramenters
pancreas.plot.i <- merge(getData(myDiff_islands.pancreas) , t.pancreas.i, all.x = TRUE) %>%
  mutate(methylation_status = case_when(
    meth.diff > meth.cut & qvalue < qvalue.cut ~ "Hypermethylated",
    meth.diff < -meth.cut & qvalue < qvalue.cut ~ "Hypomethylated",
    TRUE ~ "N.S."
  ))

ggplot(data=pancreas.plot.i, aes(x=meth.diff, y=-log10(qvalue), col=methylation_status, label = external_gene_name)) +
  geom_point() + 
  theme_minimal() +
  scale_color_manual(values=c("mediumblue", "firebrick", "gray")) +
  geom_vline(xintercept=c(-meth.cut, meth.cut), col="gray57", linetype = "longdash") +
  geom_hline(yintercept=-log10(qvalue.cut), col="gray57", linetype = "longdash") +
  geom_text_repel(show.legend = FALSE) +
  labs(title = "Differentially Methylated CpG Islands",
    subtitle =paste("qvalue <", as.character(qvalue.cut), "& |methylation difference| >=", 
    as.character(meth.cut), "%")) +
  xlab("% Methylation Change")

Differentially Methylated Promoter Regions

# Summarize the original object counts over a certain region, here the promoter
# You can ignore the warnings here...
myobj_promoters.pancreas <- regionCounts(myobj.pancreas, c(ensembl_anot.pancreas$promoters))

# Filter the summarized counts by coverage
myobj_promoters_filt.pancreas <- filterByCoverage(myobj_promoters.pancreas,
                                             lo.count=10,
                                             lo.perc=NULL,
                                             hi.count=NULL,
                                             hi.perc=99.9)
# Perform simple normalization
myobj_promoters_filt_norm.pancreas <- normalizeCoverage(myobj_promoters_filt.pancreas, method = "median")

Number of CpG promoters

# Merge the samples again
meth_promoters.pancreas <- methylKit::unite(myobj_promoters_filt_norm.pancreas, destrand=FALSE, min.per.group = min)
nrow(meth_promoters.pancreas)

# get percent methylation matrix
pm.pancreas.p=percMethylation(meth_promoters.pancreas)

# calculate standard deviation of CpGs
sds.pancreas.p=matrixStats::rowSds(pm.pancreas.p, na.rm = TRUE)

# Visualize the distribution of the per-CpG standard deviation
# to determine a suitable cutoff
hist(sds.pancreas.p, breaks = 100)

# keep only CpG with standard deviations larger than 2%
meth.pancreas.p <- meth_promoters.pancreas[sds.pancreas.p > 2]
nrow(meth.pancreas.p)
# Test for differential methylation... This might take a few minutes.
myDiff_promoters.pancreas <- calculateDiffMeth(meth.pancreas.p,
                                          overdispersion = "MN",
                                          adjust="SLIM",
                                          test = "Chisq")

# Rank by significance
myDiff_promoters.pancreas <- myDiff_promoters.pancreas[order(myDiff_promoters.pancreas$qvalue),]
# get all differentially methylated CpG promoters
myDiff_promoters_10p.pancreas <- getMethylDiff(myDiff_promoters.pancreas,difference=meth.cut,qvalue=qvalue.cut)

myDiff_promoters_10p_ann.pancreas <- annotateWithGeneParts(as((myDiff_promoters_10p.pancreas), "GRanges"), ensembl_anot.pancreas)
# View the distance to the nearest Transcription Start Site; the target.row column indicates the row number in myDiff_promoters_10p
tss.dist.promoters <- (getAssociationWithTSS(myDiff_promoters_10p_ann.pancreas))
hist(tss.dist.promoters$dist.to.feature)

Volcano plot Differentially Methylated Promoters

########### CpG promoters annotations

metadata.p <- getData(myDiff_promoters_10p.pancreas)

meth_pos.p <- GRanges(
  strand = myDiff_promoters_10p.pancreas$strand,
  seqnames = myDiff_promoters_10p.pancreas$chr, 
  ranges = IRanges(start = myDiff_promoters_10p.pancreas$start, end = myDiff_promoters_10p.pancreas$end),
  mcols = metadata.p[, c(5:7)]
)

res.pancreas.p <- annotatePeakInBatch(meth_pos.p, AnnotationData = xtro, FeatureLocForDistance="TSS", output = "overlapping", multiple = TRUE) %>% data.frame

t.pancreas.p <- res.pancreas.p %>%
  left_join(
    dplyr::select(xen.biomart, ensembl_gene_id, external_gene_name, description),
    by = c("feature" = "ensembl_gene_id"))

t.pancreas.p[,c(6:8,10,14,18:19)] %>% datatable

#plot paramenters
pancreas.plot.p <- merge(getData(myDiff_promoters.pancreas) , t.pancreas.p, all.x = TRUE) %>%
  mutate(methylation_status = case_when(
    meth.diff > meth.cut & qvalue < qvalue.cut ~ "Hypermethylated",
    meth.diff < -meth.cut & qvalue < qvalue.cut ~ "Hypomethylated",
    TRUE ~ "N.S."
  ))

ggplot(data=pancreas.plot.p, aes(x=meth.diff, y=-log10(qvalue), col=methylation_status, label = external_gene_name)) +
  geom_point() + 
  theme_minimal() +
  scale_color_manual(values=c("mediumblue", "firebrick", "gray")) +
  geom_vline(xintercept=c(-meth.cut, meth.cut), col="gray57", linetype = "longdash") +
  geom_hline(yintercept=-log10(qvalue.cut), col="gray57", linetype = "longdash") +
  geom_text_repel(show.legend = FALSE) +
  labs(title = "Differentially Methylated Promoters",
    subtitle =paste("qvalue <", as.character(qvalue.cut), "& |methylation difference| >=", 
    as.character(meth.cut), "%")) +
  xlab("% Methylation Change")

Gene Ontology

pancreas Differentially Methylated Regions

library(clusterProfiler)
library(org.Xt.eg.db)

gene.pancreas <- as.character(res.pancreas$feature) %>% na.omit

gene.pancreas %>% unique %>% length
## [1] 637
ggo.pancreas <- groupGO(gene     = gene.pancreas,
               OrgDb    = org.Xt.eg.db,
               keyType = "ENSEMBL",
               ont      = "BP",
               readable = TRUE)

ggo.pancreas@result %>% datatable
ggo.pancreas %>% filter(., Count > 0) %>% arrange(desc(Count)) %>% barplot(., showCategory = 20)

ego.pancreas <- enrichGO(gene          = gene.pancreas,
                OrgDb         = org.Xt.eg.db,
                keyType = "ENSEMBL",
                ont           = "ALL",
                pAdjustMethod = "BH",
                pvalueCutoff = 1,
                qvalueCutoff = 0.05,
        readable      = TRUE)

data.frame(ego.pancreas) %>% datatable
categorys <- c("neuron differentiation", "neurogenesis", "generation of neurons", "neuron development", "regulation of cell differentiation", "neuron projection guidance")

barplot(ego.pancreas, color = "p.adjust", showCategory = 15, title = "GO enrichment pancreas - Diff meth regions/tiles")

#goplot(ego, showCategory = 10)

Differentially Methylated Promoters

gene.pancreas.prom <- as.character(res.pancreas.p$feature)


ggo.pancreas.prom <- groupGO(gene     = gene.pancreas.prom,
               OrgDb    = org.Xt.eg.db,
               keyType = "ENSEMBL",
               ont      = "BP",
               readable = TRUE)

ggo.pancreas.prom@result %>% datatable


ego.pancreas.prom <- enrichGO(gene          = gene.pancreas.prom,
                OrgDb         = org.Xt.eg.db,
                keyType = "ENSEMBL",
                ont           = "ALL",
                pAdjustMethod = "BH",
                pvalueCutoff = 0.05,
                qvalueCutoff = 1,
        readable      = TRUE)
data.frame(ego.pancreas.prom) %>% datatable

categorys <- c("neuron differentiation", "neurogenesis", "generation of neurons", "neuron development", "regulation of cell differentiation", "neuron projection guidance")

barplot(ego.pancreas.prom, color = "p.adjust", showCategory = 10, title = "GO enrichment pancreas - Diff meth promoters")

#goplot(ego,pancreas.prom, showCategory = 10)

Differentially methylated Islands

gene.pancreas.i <- as.character(res.pancreas.i$feature)

ggo.pancreas.i <- groupGO(gene     = gene.pancreas.i,
               OrgDb    = org.Xt.eg.db,
               keyType = "ENSEMBL",
               ont      = "BP",
               readable = TRUE)

ggo.pancreas.i@result %>% datatable


ego.pancreas.i <- enrichGO(gene          = gene.pancreas.i,
                OrgDb         = org.Xt.eg.db,
                keyType = "ENSEMBL",
                ont           = "ALL",
                pAdjustMethod = "BH",
                pvalueCutoff = 0.05,
                qvalueCutoff = 1,
        readable      = TRUE)

data.frame(ego.pancreas.i) %>% datatable

barplot(ego.pancreas.i, color = "p.adjust", showCategory = 10, title = "GO enrichment pancreas - Diff meth islands")

#goplot(ego,pancreas.i, showCategory = 10)

Enrich KEGG

DMRs

entrez.gene.pancreas = getBM(attributes = c('ensembl_gene_id', 'entrezgene_id'), 
              filters = 'ensembl_gene_id', 
              values = res.pancreas$feature, 
              mart = mart)


gene.pancreas <- as.character(entrez.gene.pancreas$entrezgene_id) %>% na.omit

kk.pancreas <- enrichKEGG(gene         = gene.pancreas,
                 organism     = 'xtr',
                 pvalueCutoff = 0.05,
                 pAdjustMethod = "BH",
                 use_internal_data = FALSE)
## Reading KEGG annotation online:
## Warning in utils::download.file(url, quiet = TRUE, method = method, ...): the
## 'wininet' method is deprecated for http:// and https:// URLs
## Reading KEGG annotation online:
## Warning in utils::download.file(url, quiet = TRUE, method = method, ...): the
## 'wininet' method is deprecated for http:// and https:// URLs
## --> No gene can be mapped....
## --> Expected input gene ID:
## --> return NULL...
data.frame(kk.pancreas) %>% datatable
barplot(kk.pancreas, showCategory = 10, title = "KEGG pathways pancreas - pancreas Differentially Methylated Regions (tiles)")
## Error in barplot.default(kk.pancreas, showCategory = 10, title = "KEGG pathways pancreas - pancreas Differentially Methylated Regions (tiles)"): 'height' must be a vector or a matrix

Differentially methylated promoters

entrez.gene.pancreas.p = getBM(attributes = c('ensembl_gene_id', 'entrezgene_id'), 
              filters = 'ensembl_gene_id', 
              values = res.pancreas.p$feature, 
              mart = mart)

entrez.gene.pancreas.p = entrez.gene.pancreas.p %>% filter(duplicated(ensembl_gene_id) == FALSE)

gene.pancreas.p <- as.character(entrez.gene.pancreas.p$entrezgene_id) %>% na.omit

kk.pancreas.prom <- enrichKEGG(gene         = gene.pancreas.p,
                 organism     = 'xtr',
                 pvalueCutoff = 1,
                 pAdjustMethod = "BH",
                 use_internal_data = FALSE)

data.frame(kk.pancreas.prom) %>% datatable

barplot(kk.pancreas.prom, showCategory = 10, title = "KEGG pathways pancreas - Differentially methylated promoters")

Differentially methylated Islands

entrez.gene.pancreas.i = getBM(attributes = c('ensembl_gene_id', 'entrezgene_id'), 
              filters = 'ensembl_gene_id', 
              values = res.pancreas.i$feature, 
              mart = mart)

gene.pancreas.i <- as.character(entrez.gene.pancreas.i$entrezgene_id) %>% na.omit

kk.pancreas.i <- enrichKEGG(gene         = gene.pancreas.i,
                 organism     = 'xtr',
                 pvalueCutoff = 0.05,
                 pAdjustMethod = "BH",
                 use_internal_data = FALSE)

data.frame(kk.pancreas.i) %>% datatable

barplot(kk.pancreas.i, showCategory = 10)

Relevant genes tables

rlv.genes.pancreas <- c("ep300", "elp3", "kat5", "kat14", #histone acetyltransferase
                      "lhcgr", "hsd17b12", "esrrg",
                      "piwil1", "mael", "spo11", "\\bddx4\\b", #spermatogenesis, gonadal development
                      "dnmt3a"
)

rlv.genes.table.pancreas <- t.pancreas %>%
   filter(str_detect(external_gene_name, paste(rlv.genes.pancreas, collapse = "|")))

geneparts.anot.pancreas <- annotateWithGeneParts(target = as(rlv.genes.table.pancreas,"GRanges"),
                                              feature = ensembl_anot.pancreas, intersect.chr = TRUE, strand = TRUE)
## intersecting chromosomes...
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279347v1, chrUn_NW_022279348v1, chrUn_NW_022279349v1, chrUn_NW_022279350v1, chrUn_NW_022279353v1, chrUn_NW_022279355v1, chrUn_NW_022279357v1, chrUn_NW_022279358v1, chrUn_NW_022279359v1, chrUn_NW_022279361v1, chrUn_NW_022279362v1, chrUn_NW_022279363v1, chrUn_NW_022279364v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279369v1, chrUn_NW_022279373v1, chrUn_NW_022279374v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279378v1, chrUn_NW_022279381v1, chrUn_NW_022279382v1, chrUn_NW_022279386v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279393v1, chrUn_NW_022279394v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279398v1, chrUn_NW_022279399v1, chrUn_NW_022279400v1, chrUn_NW_022279401v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279408v1, chrUn_NW_022279409v1, chrUn_NW_022279410v1, chrUn_NW_022279411v1, chrUn_NW_022279413v1, chrUn_NW_022279415v1, chrUn_NW_022279418v1, chrUn_NW_022279421v1, chrUn_NW_022279422v1, chrUn_NW_022279424v1, chrUn_NW_022279425v1, chrUn_NW_022279426v1, chrUn_NW_022279427v1, chrUn_NW_022279429v1, chrUn_NW_022279431v1, chrUn_NW_022279433v1, chrUn_NW_022279435v1, chrUn_NW_022279437v1, chrUn_NW_022279438v1, chrUn_NW_022279439v1, chrUn_NW_022279440v1, chrUn_NW_022279441v1, chrUn_NW_022279442v1, chrUn_NW_022279443v1, chrUn_NW_022279444v1, chrUn_NW_022279446v1, chrUn_NW_022279448v1, chrUn_NW_022279449v1, chrUn_NW_022279452v1, chrUn_NW_022279453v1, chrUn_NW_022279456v1, chrUn_NW_022279457v1, chrUn_NW_022279459v1, chrUn_NW_022279460v1, chrUn_NW_022279461v1, chrUn_NW_022279467v1, chrUn_NW_022279469v1, chrUn_NW_022279470v1, chrUn_NW_022279471v1, chrUn_NW_022279472v1, chrUn_NW_022279474v1, chrUn_NW_022279479v1, chrUn_NW_022279480v1, chrUn_NW_022279483v1, chrUn_NW_022279485v1, chrUn_NW_022279488v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1, chrUn_NW_022279498v1, chrUn_NW_022279499v1, chrUn_NW_022279501v1, chrUn_NW_022279502v1
##   - in 'y': chrM
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).

## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279347v1, chrUn_NW_022279348v1, chrUn_NW_022279349v1, chrUn_NW_022279350v1, chrUn_NW_022279353v1, chrUn_NW_022279355v1, chrUn_NW_022279357v1, chrUn_NW_022279358v1, chrUn_NW_022279359v1, chrUn_NW_022279361v1, chrUn_NW_022279362v1, chrUn_NW_022279363v1, chrUn_NW_022279364v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279369v1, chrUn_NW_022279373v1, chrUn_NW_022279374v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279378v1, chrUn_NW_022279381v1, chrUn_NW_022279382v1, chrUn_NW_022279386v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279393v1, chrUn_NW_022279394v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279398v1, chrUn_NW_022279399v1, chrUn_NW_022279400v1, chrUn_NW_022279401v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279408v1, chrUn_NW_022279409v1, chrUn_NW_022279410v1, chrUn_NW_022279411v1, chrUn_NW_022279413v1, chrUn_NW_022279415v1, chrUn_NW_022279418v1, chrUn_NW_022279421v1, chrUn_NW_022279422v1, chrUn_NW_022279424v1, chrUn_NW_022279425v1, chrUn_NW_022279426v1, chrUn_NW_022279427v1, chrUn_NW_022279429v1, chrUn_NW_022279431v1, chrUn_NW_022279433v1, chrUn_NW_022279435v1, chrUn_NW_022279437v1, chrUn_NW_022279438v1, chrUn_NW_022279439v1, chrUn_NW_022279440v1, chrUn_NW_022279441v1, chrUn_NW_022279442v1, chrUn_NW_022279443v1, chrUn_NW_022279444v1, chrUn_NW_022279446v1, chrUn_NW_022279448v1, chrUn_NW_022279449v1, chrUn_NW_022279452v1, chrUn_NW_022279453v1, chrUn_NW_022279456v1, chrUn_NW_022279457v1, chrUn_NW_022279459v1, chrUn_NW_022279460v1, chrUn_NW_022279461v1, chrUn_NW_022279467v1, chrUn_NW_022279469v1, chrUn_NW_022279470v1, chrUn_NW_022279471v1, chrUn_NW_022279472v1, chrUn_NW_022279474v1, chrUn_NW_022279479v1, chrUn_NW_022279480v1, chrUn_NW_022279483v1, chrUn_NW_022279485v1, chrUn_NW_022279488v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1, chrUn_NW_022279498v1, chrUn_NW_022279499v1, chrUn_NW_022279501v1, chrUn_NW_022279502v1
##   - in 'y': chrM
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).

## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrUn_NW_022279347v1, chrUn_NW_022279348v1, chrUn_NW_022279349v1, chrUn_NW_022279350v1, chrUn_NW_022279353v1, chrUn_NW_022279355v1, chrUn_NW_022279357v1, chrUn_NW_022279358v1, chrUn_NW_022279359v1, chrUn_NW_022279361v1, chrUn_NW_022279362v1, chrUn_NW_022279363v1, chrUn_NW_022279364v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279369v1, chrUn_NW_022279373v1, chrUn_NW_022279374v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279378v1, chrUn_NW_022279381v1, chrUn_NW_022279382v1, chrUn_NW_022279386v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279393v1, chrUn_NW_022279394v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279398v1, chrUn_NW_022279399v1, chrUn_NW_022279400v1, chrUn_NW_022279401v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279408v1, chrUn_NW_022279409v1, chrUn_NW_022279410v1, chrUn_NW_022279411v1, chrUn_NW_022279413v1, chrUn_NW_022279415v1, chrUn_NW_022279418v1, chrUn_NW_022279421v1, chrUn_NW_022279422v1, chrUn_NW_022279424v1, chrUn_NW_022279425v1, chrUn_NW_022279426v1, chrUn_NW_022279427v1, chrUn_NW_022279429v1, chrUn_NW_022279431v1, chrUn_NW_022279433v1, chrUn_NW_022279435v1, chrUn_NW_022279437v1, chrUn_NW_022279438v1, chrUn_NW_022279439v1, chrUn_NW_022279440v1, chrUn_NW_022279441v1, chrUn_NW_022279442v1, chrUn_NW_022279443v1, chrUn_NW_022279444v1, chrUn_NW_022279446v1, chrUn_NW_022279448v1, chrUn_NW_022279449v1, chrUn_NW_022279452v1, chrUn_NW_022279453v1, chrUn_NW_022279456v1, chrUn_NW_022279457v1, chrUn_NW_022279459v1, chrUn_NW_022279460v1, chrUn_NW_022279461v1, chrUn_NW_022279467v1, chrUn_NW_022279469v1, chrUn_NW_022279470v1, chrUn_NW_022279471v1, chrUn_NW_022279472v1, chrUn_NW_022279474v1, chrUn_NW_022279479v1, chrUn_NW_022279480v1, chrUn_NW_022279483v1, chrUn_NW_022279485v1, chrUn_NW_022279488v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1, chrUn_NW_022279498v1, chrUn_NW_022279499v1, chrUn_NW_022279501v1, chrUn_NW_022279502v1
##   - in 'y': chrM
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrM
##   - in 'y': chrUn_NW_022279347v1, chrUn_NW_022279348v1, chrUn_NW_022279349v1, chrUn_NW_022279350v1, chrUn_NW_022279353v1, chrUn_NW_022279355v1, chrUn_NW_022279357v1, chrUn_NW_022279358v1, chrUn_NW_022279359v1, chrUn_NW_022279361v1, chrUn_NW_022279362v1, chrUn_NW_022279363v1, chrUn_NW_022279364v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279369v1, chrUn_NW_022279373v1, chrUn_NW_022279374v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279378v1, chrUn_NW_022279381v1, chrUn_NW_022279382v1, chrUn_NW_022279386v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279393v1, chrUn_NW_022279394v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279398v1, chrUn_NW_022279399v1, chrUn_NW_022279400v1, chrUn_NW_022279401v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279408v1, chrUn_NW_022279409v1, chrUn_NW_022279410v1, chrUn_NW_022279411v1, chrUn_NW_022279413v1, chrUn_NW_022279415v1, chrUn_NW_022279418v1, chrUn_NW_022279421v1, chrUn_NW_022279422v1, chrUn_NW_022279424v1, chrUn_NW_022279425v1, chrUn_NW_022279426v1, chrUn_NW_022279427v1, chrUn_NW_022279429v1, chrUn_NW_022279431v1, chrUn_NW_022279433v1, chrUn_NW_022279435v1, chrUn_NW_022279437v1, chrUn_NW_022279438v1, chrUn_NW_022279439v1, chrUn_NW_022279440v1, chrUn_NW_022279441v1, chrUn_NW_022279442v1, chrUn_NW_022279443v1, chrUn_NW_022279444v1, chrUn_NW_022279446v1, chrUn_NW_022279448v1, chrUn_NW_022279449v1, chrUn_NW_022279452v1, chrUn_NW_022279453v1, chrUn_NW_022279456v1, chrUn_NW_022279457v1, chrUn_NW_022279459v1, chrUn_NW_022279460v1, chrUn_NW_022279461v1, chrUn_NW_022279467v1, chrUn_NW_022279469v1, chrUn_NW_022279470v1, chrUn_NW_022279471v1, chrUn_NW_022279472v1, chrUn_NW_022279474v1, chrUn_NW_022279479v1, chrUn_NW_022279480v1, chrUn_NW_022279483v1, chrUn_NW_022279485v1, chrUn_NW_022279488v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1, chrUn_NW_022279498v1, chrUn_NW_022279499v1, chrUn_NW_022279501v1, chrUn_NW_022279502v1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).

## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrM
##   - in 'y': chrUn_NW_022279347v1, chrUn_NW_022279348v1, chrUn_NW_022279349v1, chrUn_NW_022279350v1, chrUn_NW_022279353v1, chrUn_NW_022279355v1, chrUn_NW_022279357v1, chrUn_NW_022279358v1, chrUn_NW_022279359v1, chrUn_NW_022279361v1, chrUn_NW_022279362v1, chrUn_NW_022279363v1, chrUn_NW_022279364v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279369v1, chrUn_NW_022279373v1, chrUn_NW_022279374v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279378v1, chrUn_NW_022279381v1, chrUn_NW_022279382v1, chrUn_NW_022279386v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279393v1, chrUn_NW_022279394v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279398v1, chrUn_NW_022279399v1, chrUn_NW_022279400v1, chrUn_NW_022279401v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279408v1, chrUn_NW_022279409v1, chrUn_NW_022279410v1, chrUn_NW_022279411v1, chrUn_NW_022279413v1, chrUn_NW_022279415v1, chrUn_NW_022279418v1, chrUn_NW_022279421v1, chrUn_NW_022279422v1, chrUn_NW_022279424v1, chrUn_NW_022279425v1, chrUn_NW_022279426v1, chrUn_NW_022279427v1, chrUn_NW_022279429v1, chrUn_NW_022279431v1, chrUn_NW_022279433v1, chrUn_NW_022279435v1, chrUn_NW_022279437v1, chrUn_NW_022279438v1, chrUn_NW_022279439v1, chrUn_NW_022279440v1, chrUn_NW_022279441v1, chrUn_NW_022279442v1, chrUn_NW_022279443v1, chrUn_NW_022279444v1, chrUn_NW_022279446v1, chrUn_NW_022279448v1, chrUn_NW_022279449v1, chrUn_NW_022279452v1, chrUn_NW_022279453v1, chrUn_NW_022279456v1, chrUn_NW_022279457v1, chrUn_NW_022279459v1, chrUn_NW_022279460v1, chrUn_NW_022279461v1, chrUn_NW_022279467v1, chrUn_NW_022279469v1, chrUn_NW_022279470v1, chrUn_NW_022279471v1, chrUn_NW_022279472v1, chrUn_NW_022279474v1, chrUn_NW_022279479v1, chrUn_NW_022279480v1, chrUn_NW_022279483v1, chrUn_NW_022279485v1, chrUn_NW_022279488v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1, chrUn_NW_022279498v1, chrUn_NW_022279499v1, chrUn_NW_022279501v1, chrUn_NW_022279502v1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).

## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
##   - in 'x': chrM
##   - in 'y': chrUn_NW_022279347v1, chrUn_NW_022279348v1, chrUn_NW_022279349v1, chrUn_NW_022279350v1, chrUn_NW_022279353v1, chrUn_NW_022279355v1, chrUn_NW_022279357v1, chrUn_NW_022279358v1, chrUn_NW_022279359v1, chrUn_NW_022279361v1, chrUn_NW_022279362v1, chrUn_NW_022279363v1, chrUn_NW_022279364v1, chrUn_NW_022279365v1, chrUn_NW_022279366v1, chrUn_NW_022279367v1, chrUn_NW_022279368v1, chrUn_NW_022279369v1, chrUn_NW_022279373v1, chrUn_NW_022279374v1, chrUn_NW_022279375v1, chrUn_NW_022279376v1, chrUn_NW_022279378v1, chrUn_NW_022279381v1, chrUn_NW_022279382v1, chrUn_NW_022279386v1, chrUn_NW_022279387v1, chrUn_NW_022279388v1, chrUn_NW_022279390v1, chrUn_NW_022279391v1, chrUn_NW_022279392v1, chrUn_NW_022279393v1, chrUn_NW_022279394v1, chrUn_NW_022279396v1, chrUn_NW_022279397v1, chrUn_NW_022279398v1, chrUn_NW_022279399v1, chrUn_NW_022279400v1, chrUn_NW_022279401v1, chrUn_NW_022279402v1, chrUn_NW_022279404v1, chrUn_NW_022279408v1, chrUn_NW_022279409v1, chrUn_NW_022279410v1, chrUn_NW_022279411v1, chrUn_NW_022279413v1, chrUn_NW_022279415v1, chrUn_NW_022279418v1, chrUn_NW_022279421v1, chrUn_NW_022279422v1, chrUn_NW_022279424v1, chrUn_NW_022279425v1, chrUn_NW_022279426v1, chrUn_NW_022279427v1, chrUn_NW_022279429v1, chrUn_NW_022279431v1, chrUn_NW_022279433v1, chrUn_NW_022279435v1, chrUn_NW_022279437v1, chrUn_NW_022279438v1, chrUn_NW_022279439v1, chrUn_NW_022279440v1, chrUn_NW_022279441v1, chrUn_NW_022279442v1, chrUn_NW_022279443v1, chrUn_NW_022279444v1, chrUn_NW_022279446v1, chrUn_NW_022279448v1, chrUn_NW_022279449v1, chrUn_NW_022279452v1, chrUn_NW_022279453v1, chrUn_NW_022279456v1, chrUn_NW_022279457v1, chrUn_NW_022279459v1, chrUn_NW_022279460v1, chrUn_NW_022279461v1, chrUn_NW_022279467v1, chrUn_NW_022279469v1, chrUn_NW_022279470v1, chrUn_NW_022279471v1, chrUn_NW_022279472v1, chrUn_NW_022279474v1, chrUn_NW_022279479v1, chrUn_NW_022279480v1, chrUn_NW_022279483v1, chrUn_NW_022279485v1, chrUn_NW_022279488v1, chrUn_NW_022279494v1, chrUn_NW_022279497v1, chrUn_NW_022279498v1, chrUn_NW_022279499v1, chrUn_NW_022279501v1, chrUn_NW_022279502v1
##   Make sure to always combine/compare objects based on the same reference
##   genome (use suppressWarnings() to suppress this warning).
feature.pancreas <- genomation::getMembers(geneparts.anot.pancreas)

feature.pancreas <- feature.pancreas %>% data.frame %>%
  mutate(location = case_when(
    prom == 1 ~ "promoter",
    exon == 1 ~ "exon",
    intron == 1 ~ "intron",
    TRUE ~ "intergenic"
  ))

rlv.genes.table.pancreas <- rlv.genes.table.pancreas %>%
  bind_cols(DMR_location = feature.pancreas[,4])

rlv.genes.table.pancreas.final <- rlv.genes.table.pancreas %>%
  dplyr::select(c(gene = "external_gene_name",
                "DMR_location",
                methylation_difference = "mcols.meth.diff",
                q.value = "mcols.qvalue",
                chromosome = "seqnames",
                "start", "end",
                strand = "feature_strand",
                "description"
                )) %>%
      arrange(factor(gene, levels = c(
                      "piwil1", "mael", "spo11", "ddx4", #spermatogenesis, gonadal development
                      "dnmt3a", "ep300", "elp3", "kat5", "kat14", #histone acetyltransferase, methylation
                      "hsd17b12", "esrrg")))

rlv.genes.table.pancreas.final$description <- str_replace(rlv.genes.table.pancreas.final$description , " \\s*\\[[^\\)]+\\]", "")

Supplementary tables

library(writexl)

DMR_genes <- list("pancreas" = t.pancreas)
GO_terms <- list("pancreas" = ggo.pancreas@result)
GO_enrich <- list("pancreas" = data.frame(ego.pancreas))
KEGG_enrich <- list("Brain" = df.kk.brain)

write_xlsx(DMR_genes, path = "../supplementary_material_tables/S1_DMR_and_overlapping_genes.xlsx")
write_xlsx(GO_terms, path = "../supplementary_material_tables/S2_GO_terms.xlsx")
write_xlsx(GO_enrich, path = "../supplementary_material_tables/S3_GO_enrichment.xlsx")
write_xlsx(KEGG_enrich, path = "../supplementary_material_tables/S4_KEGG_enrichment.xlsx")



write_excel_csv(t.pancreas, file = "../supplementary_material_tables/S2_pancreas_DMR_and_overlapping_genes.csv", quote = "none", delim = '\t', col_names = TRUE, na = "", eol = "\r\n")


write_excel_csv(ggo.pancreas@result, file = "../supplementary_material_tables/S4_pancreas_GO_terms.csv", quote = "none", delim = '\t', col_names = TRUE, na = "", eol = "\r\n")

write_excel_csv(data.frame(ego.pancreas), file = "../supplementary_material_tables/S6_pancreas_GO_gene_enrichment.csv", quote = "none", delim = '\t', col_names = TRUE, na = "", eol = "\r\n")

Paper plots

#volcano plot showing important genes

#pancreas
important.genes.pancreas <- data.frame(
  genes = c("ep300", "elp3", "kat5", "kat14", #histone acetyltransferase
                      "hsd17b12", "esrrg",
                      "piwil1", "mael", "spo11", "\\bddx4\\b", #spermatogenesis, gonadal development
                      "dnmt3a"
)
) %>% mutate(genes1 = genes)


volcano.gene.labels.pancreas <- merge(pancreas.plot, important.genes.pancreas, by.x="external_gene_name", by.y = "genes1", all.x = TRUE)


  
imp.genes.plot.pancreas <- ggplot(data=volcano.gene.labels.pancreas, aes(x=meth.diff, y=-log10(qvalue), col=methylation_status, label = genes)) +
  geom_point() + 
  theme_minimal() +
  scale_color_manual(values=c("mediumblue", "firebrick", "gray")) +
  geom_vline(xintercept=c(-meth.cut, meth.cut), col="gray57", linetype = "longdash") +
  geom_hline(yintercept=-log10(qvalue.cut), col="gray57", linetype = "longdash") +
  geom_text_repel(data = subset(volcano.gene.labels.pancreas, methylation_status != "N.S."),
                  color = "black",
                    show.legend = FALSE, 
                  nudge_y = 50, max.overlaps = 20, size=5, force = 5,
                  box.padding = 2,
                  segment.colour = "gray") +
  labs(title = "pancreas Differentially Methylated Regions",
    subtitle =paste("qvalue <", as.character(qvalue.cut), "& |methylation difference| >=", 
    as.character(meth.cut), "%")) +
  xlab("% Methylation Change") +
  ylim(0, 95)

imp.genes.plot.pancreas
## Warning: Removed 1084 rows containing missing values (geom_text_repel).

ggsave("../figures/volcano_pancreas.tiff", 
       imp.genes.plot.pancreas,  
       scale = 1.5, 
       dpi = 300,
       width = 15,
       height = 10,
       units = "cm")
## Warning: Removed 1084 rows containing missing values (geom_text_repel).
# List all objects in the R environment
objects <- ls()

# Loop over all objects and remove only those larger than 100MB
for (object in objects) {
  size_mb <- object.size(get(object)) / 2^20  # Calculate size in MB
  if (size_mb > 50) {
    message(paste("Removing object", object, "of size", size_mb, "MB"))
    rm(list=object, envir=.GlobalEnv)  # Remove object from global environment
  }
}